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A main scientific output of the LISA Pathfinder mission is to provide a noise model that can be 
extended to the future gravitational wave observatory, LISA. The success of the mission depends 
thus upon a deep understanding of the instrument, especially the ability to correctly determine the 
parameters of the underlying noise model. In this work we estimate the parameters of a simplified 
model of the LISA Technology Package (LTP) instrument. We describe the LTP by means of 
a closed-loop model that is used to generate the data, both injected signals and noise. Then, 
parameters are estimated using a Bayesian framework and it is shown that this method reaches 
the optimal attainable error, the Cramer-Rao bound. We also address an important issue for the 
mission: how to efficiently combine the results of different experiments to obtain a unique set of 
parameters describing the instrument. 



I. INTRODUCTION 

LISA Pathfinder pQ is an ESA mission, with some 
NASA contributions, that aims at testing key technolo- 
gies for the future space gravitational wave observa- 
tory, LISA [2 . The main aim is to demonstrate the 
ability to put a test mass in to free-fall at a level of 
3 x 10" 14 ms- 2 /VHz at lmHz. The LISA Technology 
Package (LTP) is the main instrument on-board LISA 
Pathfinder. It comprises two test masses enclosed in in- 
ertial sensors which are in turn housed inside individ- 
ual vacuum tanks, composing the so called Gravitational 
Reference Sensor [3]. The two tanks are then mounted 
to a support structure which also holds an optical bench 
between the tanks. The optical bench and the associ- 
ated interferometry are part of the Optical Metrology 
System [4]. In order to reach the goal stated above, the 
full LTP must be characterized and optimized. This will 
involve developing a full parametric noise model of the 
instrument, which will be improved over the course of the 
mission. 

The LISA Pathfinder mission comprises a series of ex- 
periments. Many of the experiments aim to reduce the 
noise in the system so as to produce the quietest residual 
acceleration measurement possible. Other experiments 
will aim to characterize the instrument. This typically 
involves determining the various parameters that go into 
the physical model of the instrument. Clearly, a good 
model is needed to be able to target and reduce par- 
ticular noise sources, whereas reducing the various noise 
sources leads to a more sensitive instrument. Various 
experiments will be repeated under different conditions, 
and as the noise is reduced, we would expect that the de- 
termination of the physical parameters will become more 
and more accurate. One essential aspect of this multiple- 
experiment mission is the ability to include the results 
from analyzing the previous experiments in further ex- 
periments, and in particular, it will be necessary to com- 



bine the various experiments to gain the best knowledge 
about the particular physical parameters. The analysis 
procedures and software need therefore to remain flexible 
in order to react to the results of the experiments as they 
are performed. This paper presents a Bayesian analysis 
for determining particular physical parameters of the sys- 
tem. Using a Bayesian framework leads to a natural way 
of combining a series of experiments. The result of one 
analysis becomes prior information in subsequent analy- 
ses. The analysis is presented for a reduced set of physical 
parameters in the context of the Mock Data Challenges 
(MDC) [5. that are being carried out during the develop- 
ment of the data analysis procedures for the mission. In 
MDC1 6 the focus was on developing a simple model of 
the system, together with establishing routines for cal- 
ibrating the measured test mass displacements back to 
equivalent residual external test mass accelerations. In 
MDC2, the focus shifts to parameter estimation. The 
analysis and procedures presented in this paper represent 
one of the methods being developed for the mission. 



II. THE SECOND LTP MOCK DATA 
CHALLENGE 

The aim of the second MDC was to develop and test 
reliable methods to accurately estimate the parameters 
of the LTP noise model during flight operations. In order 
to focus on methods and not on model complexity, it was 
decided to keep a very similar model as the one analyzed 
during the first MDC. The basic difference regarding the 
previous challenge is that now 5 parameters are consid- 
ered as degrees-of-freedom of the system, which need to 
be determined by stimulating the system using injected 
signals. It is worth recalling that the first MDC did not 
include any signal injection in the data, since it was de- 
signed as a test of the calibration of displacement noise 
to acceleration noise, and therefore only a noise measure- 
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FIG. 1. The LTP MDC2 model. Le/t: Simplified scheme of the LTP instrument. Only two out of the four heterodyne 
interferometers are represented here: the one measuring spacecraft to first test mass distance, o x i, and the one measuring 
test mass to test mass distance, o& x . See text for a description of terms appearing in the picture. Right: The previous is 
described as a control loop: the boxes describe the interferometer (IFO), controllers and dynamics of the test masses. The 
circles represent noise contributions, diamonds are signal injection points and the triangles denote cross-couplings between the 
first (o x i) and second channel (oax)- 



ment (signal free) was simulated. The current challenge 
is therefore a natural extension to the first one. 

It is important to notice that, due to the nature of the 
LISA Pathfinder mission, our description of the system 
necessarily needs to deal with the closed-loop dynamics 
of the spacecraft and test masses together. 

The description that we show in Section |il A| is there- 
fore a closed-loop system where we take into account the 
feedback between different components and show where 
parameters and noise contributions enter in the non- 
linear model that is described in terms of transfer func- 
tions in the frequency domain. We want to recall that 
this approach differs from the one used to model LISA to 
the date. The data generators that are providing data in 
the LISA Mock Data Challenges [7HS] are focused on the 
geometry of the spacecraft configuration, since the main 
concern is, in that case, the suppression of frequency 
noise due to the unequal arms. But, on the other hand, 
they consider additive noise sources inside each space- 
craft. A second important remark is that LISA genera- 
tors model noise sources as white gaussian contributions. 
This is clearly unrealistic and could be particularly mis- 
leading in the relevant region around 1 mHz, since each 
of the noise sources will contribute with a f~ p (p ~ 1) 
power spectrum that will set the low frequency perfor- 
mance of LISA. LTP is designed to study that region 
and therefore our model needs to describe these low fre- 
quency contributions in more detail. The noise models 
and the parameters used are described in Section |II B| 
The description provided in this paper will complement 
the one already existing within the LISA community and 
will facilitate the interaction between both communities 
to a common goal, which is a realistic understanding of 
the LISA instrument. 

In terms of implementation, it is worth mentioning 



that the current challenge is completely implemented as 
LTPDA tools [5] , which means that any user of this tool 
has the means available to produce LTP-like data (as de- 
scribed in the following section) by executing a relatively 
simple MATLAB [10] script. 

A. Dynamical model 

When compared with other space missions, the LTP 
is a very flexible instrument in terms of the possible op- 
erational scenarios. It can be configured to use different 
combinations of the available sensors onboard, either op- 
tical or capacitive, with the aim of performing different 
geodesic measurements, or even to work as an accelerom- 
eter. The aim of the second MDC was not to cover all of 
these possibilities but to analyze the instrument behav- 
ior for a fixed operating mode: the main science mode 
— described as M3 mode in [TT]. Moreover, this control 
scheme is reduced in this analysis to the one-dimensional 
case in order to simplify the model and focus on the anal- 
ysis. In this simplified model, the x position of both test 
masses is controlled by means of the optical readouts. 
A first interferometer measures the relative distance be- 
tween test mass 1 and the spacecraft, x\. This is a rel- 
atively noisy measurement since the noise of the space- 
craft's micro-Newton thrusters appears directly in the 
measurement. A second interferometer measures the rel- 
ative distance between both free falling test masses. This 
channel, that we call in the rest of this paper, will 
be the one giving an unprecedently quiet measurement 
of the differential acceleration (or displacement) between 
two test masses, since the contribution of the thruster 
noise effectively cancels out [T2"] . 

The model of the LTP dynamics control loop is shown 
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in Figure [T] The right panel of this figure shows two 
control loops for the two measurement channels that we 
just described: Xi andx^. This schematic representation 
of the closed loop system can be analytically expressed 
in terms of the following set of equations 13J 



D • q = g, 

g= C ■ {o + Oi) 
o = S-q + o^, 



9n, 



(1) 



where D is the dynamical matrix, C is the controller, and 
S stands for the sensing matrix (the interferometer in our 
case), i.e., the matrix translating the position of a test 
mass, q, into the interferometer readout, o. Subindex n 
stands for noise quantities, either sensing noise (S n ) or 
force noise (g n ) and subindex i stands for the injected 
signals (Si). All of these are 2-dimensional vectors with 
components referring to the xi and xa channels respec- 
tively, 
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The last equation shows how any noisy force applied to 
the spacecraft (g^v) is only measured in the first channel 
(if there were no cross terms). On the other hand, the 
differential channel is sensitive to the difference of force 
noise applied to the first and the second test mass, g\ 
and 52 respectively. 
The matrices read as 



D 



S = 







1 

621 1 







Csus -^sus 



(3) 



where uj\ and W2 are the stiffness — the steady force gra- 
dient across the test mass housing per unit mass [H] - 
coupling the motion of each test mass to the motion of the 
spacecraft; Gdf and G sus are constant factors acting as 
calibration factors of the controller, iJdf and H sus . These 
are the control laws of the loop and will be considered 
known transfer functions in the following; S21 is the in- 
terferometer cross-coupling, a small term accounting for 
the imperfection of the interferometer that will produce a 
spurious signal in the differential channel when only the 
first test mass moves. The interferometer has no cou- 
pling going from oa to o\ and therefore we set 5\2 = in 
the sensing matrix. The previous are the 5 parameters 
that we will consider in the following discussion, the ones 
characterizing the dynamics of the instrument. 



TABLE I. Parameters for the LTP MDC2 model 
Dynamical Parameters 



Parameter 



Value 



Gdf 




0.8 




Gsus 




1.15 








-11 x 10~ 7 




2 
LO-, 




-22 x 10" 7 




S2I 




1.35 x 10~ 4 




Noise Parameters 


Parameter 


Onl/o n A 


g n i/gn2 


9n 


Pi 


3.6 x 1(T 12 


7 x 10" 15 


2.5 x 10 -10 


P2 


10 x 10" 3 


5 x 10~ 3 


12 x 10 -3 


P3 


4.2 


3 


3.8 


m 


1.8 x 1(T 3 


4 x 10~ 4 


1 x 10~ 3 




8 


8 


8 



The leading diagonal terms in Equation ^ describe 
the dynamics of each channel (for example, s 2 + w 2 is 
Newton's law in the Laplace domain for the first test 
mass, with u>\ being the test mass stiffness), and the 
control law (for example, Gdf ff<jf stands for the drag- 
free transfer function controller on the first test mass, 
multiplied by a constant calibration factor, Gdf). The 
off-diagonal terms are the cross-couplings between the 
two channels appearing as triangles in Figure [T] From 
Equation (|3| we can compute the response of the inter- 
ferometer once all the dynamical and noise parameters 
are given as 



(D-S- 1 + C)- 1 (-Co l 



- 9v 



D S 



(4) 



This equation describes the interferometer output and 
will be the variable that we will use to evaluate the in- 
terferometer response. It may be useful to express the 
nominal output as a signal and two noise terms: 



0= G.(0) Si + G no (6) o n + G ng (9) g n , 



(5) 



where O = {Gdf, Gdf, w 2 , w|, ($21} are the unknown model 
parameters we are interested in determining. 

Our model can be thought of as a first term which fil- 
ters the input signal (Si) and two further terms which 
filter the noise. It must be stated that, since our fi- 
nal aim is to characterize the noise model, the noise 
terms also contain information about our parameters. 
But, since we will be working in a high signal-to-noise 
ratio (SNR) regime, we will not consider this depen- 
dence in our analysis and we will further simplify the 
model with the approximations G no (w, O) ~ G no (w) 
and G ng (oj,0) G ng (cj). This allows us to rewrite 
Equation ([5| as 



G s (6)o 4 



(6) 



where n now represents the overall noise of the instru- 
ment. The first term then contains all the model depen- 
dence that we will be able to test with our experiments. 
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The transfer function in this formulation now has the 
following components 



10 



G] 1 
Gl 1 



G 



22 



ojf - lu 2 + G df H d^ (uJ) , 

o, 

G df g d f {u>% - S 2 i (oj 2 - cjj)) 
(u>l -cj 2 + G d{ H d{ ) (cj 2 -uj 2 + G l{s H l{B 
G\f s H lis (uj) 



LJ% - UJ 2 + G]isH]Sa(u>) 



same channel (i.e., testing the diagonal terms), 
able to determine either {G d i,0J 2 } or {G sus ,ui 2 }, and it 
is through the non-diagonal (cross-coupling) term that 
we can determine the 621 parameter and the difference 
between stiffnesses, u> 2 — u> 2 . The experiments in this 
MDC were designed to test these possible combinations 
of injected signals, as described in the following. 



B. Model parameters 



(7) 


S3 






(8) 


£10 

>< 


~ v (9) 


I Dens 




TO 


(10) 


U 

£10 






; in the 


kin pi 


we are 





io" 14L 

0.0001 



t j 

/ 1 
/ / 

------ -;^s^A\ 






S x1 
®x12 

x1:xA 

---model 


* * 

/ / \ 


\ ,~ T ri|| 







i 

... 




Tip 

% 


/ 

t 

* 







0.01 1 

Frequency [Hz] 



20 



FIG. 2. Amplitude spectral density of a noise realization of 
the LTP MDC2 noise model compared to analytical curves. 
We compare the noise of the first channel (S x i), the second 
channel (S x a) and the absolute value of the cross-spectra be- 
tween both (Sxl:xA) • 



Our model is defined by a total of 30 parameters, which 
can be divided into two groups: noise parameters and dy- 
namical parameters. The first ones are those ones used to 
set the noise shapes of the individual noise contributions 
— force noise g n and interferometer read-out noise o n in 
Equation ^ — that will set the final instrument noise 
level. Each contribution is described as 

S(w)=pl [ 1 + 7 \ P3 + , \ P5 ) , (11) 



and therefore 5 parameters are required for each of them, 
for a total of 25 to describe all noise contributions. We 
need to add to these the 5 parameters that characterize 
the joint dynamical behavior of the spacecraft and test 
masses. Only the latter will be the parameters that we 
will be interested in recovering from the data in this chal- 
lenge. As stated above these are: stiffness for each test 
mass (u! 2 ,ui 2 ), calibration for each controller (Gdf,G S us) 
and interferometer cross-coupling (<$2i)- 

Table U contains all numerical values used in the second 
Mock Data Challenge, and therefore fully characterizes 
the model. Although the model allows for different noise 
levels for x\ and interferometer noise, we did not use 
this degree of freedom and set both interferometers to 
behave equally The same applies to the force noise acting 
on both test masses. 



C. Experiments 

Three experiments were proposed for MDC2. These 
were originally motivated by first studies about the sen- 
sitivity attainable by injected signals during the mis- 



sion [13] and correspond to a frequency sweep in the mea- 
surement bandwidth at four different frequencies. Our 
experiments in MDC2 consider only the possibility of in- 
jected signals as simulated interferometric signals, the 
so-called interferometric bias, which we have labelled in 
Equation ^ and in Figure [l] as a;. LISA Pathfinder 
will allow other kinds of injected signals, for instance, 
forces applied to the spacecraft via the thrusters or forces 
directly applied to the test masses via the capacitive 
sensors but, as stated above, it is not the aim of this 
work to explore all capabilities of the mission. In that 
sense, extending the analysis to include all possible injec- 
tion signals is one of the aims of the forthcoming LISA 
Pathfinder MDCs. The three proposed experiments for 
this challenge were the following: 

Experiment 1: Two signals are injected independently 
into the first and second channel. Each signal is 
a sequence of sinusoids with different amplitudes, 
frequencies and duration, all of them known to the 
data analysis team. This experiment is the richer 
in terms of frequencies injected to the system, and 
the one with best expected parameter estimates, as 
we show in the following section. 

Experiment 2: A signal is injected in the first chan- 
nel and both test mass stiffnesses are set to the 
same value, different than the value for the two 
other experiments. This configuration represents 
the matched stiffness configuration in the real LISA 
Pathfinder satellite. This state can be achieved 
by commanding an equal bias voltage on the elec- 
trodes of the inertial sensors at a level which dom- 
inates all other stiffness effects thus resulting in an 



5 



Experiment 1 




,x10 





input o M 

input o jA 





















2000 4000 6000 8000 1000012000 
Time [s] 

Experiment 2 




2000 4000 6000 8000 10000 
Time [s] 






_input o.. 























3500 7000 10500 14000 
Time [s] 

Experiment 3 




3500 7000 10500 14000 
Time [s] 




,x10 




.x 10 



I 05 

I 

E 

< -0.5 



2000 4000 6000 8000 1000012000 
Time [s] 





ll 


1 1 




output o\ 






















1 




': 11 












11 









2000 4000 6000 8000 1000012000 
Time [s] 



FIG. 3. The three MDC2 experiments. From left to right: scheme of injected signal, input signal and output signal. From top 
to bottom: experiment 1, 2 and 3. Only X12 output is shown for experiments 2 and 3, the response of the first channel to the 
injected signal is similar to the one shown in experiment 1. 



equal coupling between the two test masses and 
the spacecraft. This scheme is particularly useful 
since it would ideally decouple any external force 
from the differential measurement. However, in our 
simplified model there is already a second cross- 
term, the interferometer cross-coupling, S21, mix- 
ing both channels — see Equation ([9]). Being the 
only remaining cross-coupling in this experiment, 
this parameter should therefore be obtained with 
the greatest accuracy when analyzing this data set. 

Experiment 3: The last experiment again applies only 
one signal to the first channel but without match- 
ing the stiffness for both test masses. This exper- 



iment tests the ability to recover the same param- 
eters that we determine in Experiment 1, but by 
only injecting signals into the first channel. 

The data set in MDC2 also included a run without 
any injected signal from where the instrument perfor- 
mance could be evaluated. A typical noise realisation for 
this model is shown in Figure [2] whereas the three MDC2 
experiments are represented in Figure |3j all of them gen- 
erated using LTPDA methods. The concept behind the 
data generation process is to translate the transfer func- 
tions appearing in Equation ([5]) into digital filters, and 
then use those filters to translate the input signal into the 
measured output. Since the measured data is a combi- 



G 



nation of signal and noise, the data generation procedure 
is consequently split into two branches that are added at 
the end. The generation of the signal part is straightfor- 
ward since it only requires the filtering of a deterministic 
signal. In contrast, the noise part requires the use of 
digital filters to color white-noise and to do it in such a 
way that the noise cross-correlation properties between 
the two channels are correctly reproduced. A detailed 
description of this process can be found in [T3] . 



III. DATA ANALYSIS 

A. Bayesian estimation 

We would now like to infer unknown parameters from 
the simulated data. To this end we need to derive the 
posterior probability distribution of the parameters, that 
is, the conditional probability distribution of the param- 
eters for the given data at hand. The posterior distribu- 
tion expresses the information about the parameters by 
assigning probabilities across parameter space, and by 
that allows us to derive the most likely values and their 
uncertainties [151 116) . The posterior distribution is given 
by Bayes' theorem, and it depends on the data as well as 
any other prior information 7: 



P(6|D,J) 



P(9|7) x P(7>|9,7) 



oc P(9|7) x P(7>|9,7). 

The prior probability distribution P(9|7) expresses in- 
formation we may have about the parameter values 
(in addition to the data D), while the likelihood func- 
tion P(7J|9, 7) describes the probabilistic relationship be- 
tween parameters and the (noisy) measurements. The ev- 
idence P(7?|7) is usually not of concern for parameter es- 
timation purposes and constitutes a normalizing constant 
here. In this work we will assume uniform prior distribu- 
tions for all parameters, i.e., the prior density P(9|7) is 
constant across the allowed region as defined in Table [I] 
Given the simplified model in Equation (|6| we start 
by assuming that the noise term ft is Gaussian. The 
noise in each of the two output channels is characterized 
by the (known) one-sided power spectral density func- 
tions 5 x i (/) and S , x a(/), respectively. In addition, the 
noise is assumed to be correlated between the two out- 
puts, which is expressed through the cross spectral den- 
sity S'x1:xa(/)- Due to the colored noise it will be con- 
venient to express the likelihood function in terms of the 
Fourier transformed data. The likelihood function then 
is given by 



p(D\S,I) = \(2n) N / 2 detS 



-1/2 



(13) 



x exp 



i(o-G s (6)o 4 ) T S- 



(o-G s (@)oi 



so that (up to a multiplicative factor) the logarithmic 
likelihood is proportional to the quadratic form 



log(p(£>|e,/)) cx -i(o-G s (e)o 4 ) T S- 1 (o-G s (9)o I ). 

(14) 

where X is the covariance matrix of the (Fourier domain) 
noise term ft. The covariance matrix entries are then 
defined by the spectral and cross-spectral density values 
corresponding to the Fourier frequencies. Most of S's 
entries are zero (since only the terms corresponding to the 
same Fourier frequency are correlated) and the quadratic 
form may be rearranged so that S is of a block-diagonal 
form and the likelihood expression simplifies to a sum 
over the blocks of correlated terms at each frequency bin: 



log(p(7J|9,7)) cx -^Rel 



rj), (15) 



where j = 0, . . . , N/2 is an index over the Fourier fre- 
quencies fj, and rj and T,j denote the two (complex- 
valued) residual terms and corresponding covariance ma- 
trix at frequency fj: 



-(G n (e)o il + G 12 (Q)o iA )](f j ) 

-(G 2 i(e) 0il + G 22 (e)o iA )](/ J ) 

Sxl(fj) S'xl:xA(/j)* \ 
'5'xl:xA(/j) S x A{fj) J 




(16) 



B. Optimal parameter estimation errors 

In order to get an idea of what kind of information 
the simulated experiments will provide, we will use the 
Fisher information formalism to estimate the measure- 
ment errors to be expected from the different experimen- 
tal settings. The Fisher information and the correspond- 
ing Cramer-Rao bound (CRB) provide an estimate of the 
measurement uncertainties to be expected in the limit of 
a large signal-to-noise ratio (SNR) [17]. For an unbiased 
estimate of 9, the CRB can be expressed as 



cov(9) > J" 1 (9), 



(17) 



where J(9) is the Fisher information matrix. For our 
particular case it will shown to be useful to use the 
Cramer-Rao bound expressed as [T5] . 
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duj 



doj{to,Q) do k (to,Q) 



S jk (cu,Q) 86 l 



T 
Air 



(lid 



dS jk (u,Q) dS 3k {co,Q) 



dO, 



(18) 



where we sum over the two channels; o x \ and o x a being 
the two components of the nominal output, Sjk(u,®) 
the components of the cross-spectrum matrix and T the 
integration time. We are considering here the parametric 
dependence of the noise terms — Equation Although 
we will drop it in the next step, we want to explicitly state 
that term since it is usually not considered in the Fisher 
matrix analysis among the gravitational wave commu- 
nity |17j . but it may turn out to be relevant in future 
analysis since the noise model characterization is the fi- 
nal purpose of the LTP mock data challenges. However, 
for this first application, and to avoid cumbersome equa- 
tions, we decided not to include those terms considering 
that they will not introduce any relevant information in 
the high SNR regime where we are working. Switching 
therefore to Equation ([6| and substituting into Equa- 
tion (fl8|) leads to 



[J(e)] /m 

E°i,j °i,k 



(19) 



duj 



1 dG jk (G) dG jk (@) 



S jk (u) dQ l 



d9 r , 



where now Oj i and o^a are the two components of the 
input signal and Gj k (&) the components of the transfer 
function. We will use Equation ( 19 ) in the following to 



evaluate the CRB in each experiment. It is important to 
keep in mind that the three experiments analyzed here 
contain different configurations of the instrument, mean- 
ing that both the transfer function elements and the sig- 
nals are changing in each experiment. 



TABLE II. Cramer-Rao bound. Values between parenthesis 
expressed in relative parts per thousand (%c) 



Parameter 


Exp. 1 


Exp 


2 


Exp. 3 


°"G df 


2 x 1(T 5 (0.02) 


5 x 10" s 


(0.06) 


2 x 10" 4 (0.2) 


°"G SU3 


3 x 10" 7 (0.0002) 


3 x 10" 


- 3 (3) 


3 x 10" 4 (0.3) 




6 x 10" 10 (0.5) 


3 x 10" 6 


(1000) 


9 x 10" 8 (80) 




3 x 10" 10 (0.1) 


3 x 10" 6 


(1000) 


9 x 10" 8 (40) 


crs 21 


6 x 10" 8 (0.5) 


4 x 10" 


H (0.2) 


1 x 10" 7 (0.9) 




5 x 10" 10 (0.4) 


6 x 10" 


10 H 


3 x 10" 10 (0.3) 



Table [IT] summarizes the optimal error estimates that 
the data analysis should return. The last column refers 
to the achievable standard deviation in the difference be- 
tween squared stiffnesses, Acj 2 = uj\ — uj\. This will be 
only indirectly estimated by the analysis, but we added it 
to the table, firstly, because the cross-coupling between 
both channels depends directly on this difference, but 
also because the error in the estimation of the stiffnesses 
difference depends on the non-diagonal terms of the co- 



variance matrix. This quantity adds then some more in- 
formation not contained in the other parameters, which 
are extracted purely from the diagonal terms. The <tau 
error is computed as 



&U2 2<7 W 1 tU!2 , 



(20) 



where <j£, and are the variances of the stiffness 
squared of test mass 1 and test mass 2, and <J LJltUl2 is 
the covariance term containing the correlation between 
both stiffnesses. A remarkable result from this analy- 
sis is that a single experiment injecting a signal in both 
channels (experiment 1) is enough to determine all pa- 
rameters with high precision. In fact, this experiment 
is preferable to the other experiments which only inject 
signals in the X\ channel. Only the matched stiffness 
experiment (experiment 2) gives a slightly better esti- 
mation of the interferometer cross-coupling. Precision in 
this parameter is gained however at expenses of increas- 
ing the uncertainty in the determination of the absolute 
value of the stiffnesses, reaching in this case 100%. In 
principle, if we take into account our simplified model, 
experiment 3 would be redundant, not adding more in- 
formation (apart from statistical averaging) than what 
we get from experiments 1 and 2. 

In order to give some more insight in what refers the 
difference between experiments we provide in Table |III| 
the correlation matrices as computed with the previous 
formalism. These results complement the ones in Table 
PH since the diagonal terms of the latter correspond to the 
values reported in the former. Comparison between cor- 
relation matrices show how experiment 1 is disentangling 
the different parameteres dependences more efficiently. 
In particular, it is the only experiment which is able to 
differentiate the contribution of the two stiffnesses. The 
reason for that being that it is the only experiment with 
a signal injected in the differential channel. 



C. Combining the results of experiments 

1. The information propagation problem 

As opposed to the usual application of Bayesian pa- 
rameter estimation in LISA, where a single set of data is 
used to determine the parameters of a multiplicity of sys- 
tems, i.e., astrophysical sources, in our case we use differ- 
ent sets of data (experiments) to characterize a unique 
system, the LTP experiment. Thus, once we have ob- 
tained the parameter estimates for each experiment we 
still need to go further to achieve our final goal. Since 
each experiment can be adding valuable, but partial, in- 
formation about the instrument, we need to find a scheme 



TABLE III. Correlation matrices for MDC2 experiments 





Gdf 


Gsus 


O 


o 


&21 








Experiment 1 






Gdf 


1 


0.0003 


-0.1 


-0.001 


-0.2 


Gsus 


0.0003 


1 


-0.3 


-0.5 


-0.001 


9 

ul 


-0.1 


-0.3 


1 


0.5 


0.5 


wf 


-0.001 


-0.5 


0.5 


1 


0.005 


5-21 


-0.2 


-0.001 


0.5 


0.005 


1 








Experiment 2 






Gdf 


1 


0.4 


-0.6 


-0.6 


0.2 


Gsus 


0.4 


1 


-0.7 


-0.7 


0.3 


O 

ul 


-0.6 


-0.7 


1 


« 1 


-0.4 


Q 


-0.6 


-0.7 


w 1 


1 


-0.4 


5-21 


0.2 


0.3 


-0.4 


-0.4 


1 








Exp6rim6irt 3 






Gdf 


1 


0.03 


-0.02 


-0.02 


0.04 


Gsus 


0.03 


1 


-0.8 


-0.8 


0.3 




-0.02 


-0.8 


1 


« 1 


-0.09 




-0.02 


-0.8 


w 1 


1 


-0.09 


S 2 1 


0.04 


0.3 


-0.09 


-0.09 


1 



that allows us to include all the information in a final set 
of parameters. 

The efficient combination of results is also an impor- 
tant problem to solve in terms of mission operations. It 
should be noted that the LISA Pathfinder mission will 
be a space laboratory with approximately 100 channels 
being sampled and more than 50 parameters defining its 
performance. It will therefore be crucial to combine the 
results from one experiment with the ones following. For 
instance, we may be interested in using the determination 
of the stiffness to calibrate the thrusters in a forthcom- 
ing experiment. Given the limited mission time and the 
high numbers of experiments to be performed, the need 
for a clear combination scheme is evident. We explore 
in the following how to take advantage of the posterior 
distribution to that end. 



2. The general case 

a. Identical parameter sets First consider the case 
where the parameter sets are identical for the data sets 
to be combined (as e.g. in Experiments 1 and 3 above). 
Suppose we have a parameter vector and two data sets 
D\ and D 2 . Similar to the general case in Equation ( 12 1, 
the posterior distribution P(0|£>i, D 2 ,I) is then given by 



prior 



likelihood 



P(9|£>i, D 2 , 1) ex PjgjJ) x PQDije, I) x P(D 2 \e, I) , (21) 

prior likelihood 

where the same expression may be motivated by either 
taking the likelihood to be the product of the individual 



experiments' likelihoods or by analyzing the experiments 
one after the other and using the posterior from the first 
experiment as the prior for the second experiment (21). 

b. Differing parameter sets In order to deal with 
differing parameter sets that only partially overlap, one 
needs to consider the union of all the unknowns as the 
set of parameters. Combining data from different exper- 
iments then works exactly as in Equation ( |21[ ), only that 
the parameter vector is now the extended parameter 
set. The likelihood functions are exactly the same as in 
the individual-experiment case, with the only difference 
that, as functions of the extended parameter set, they do 
not depend on some of the parameters. 

Consider the case where two data sets D\ and D 2 de- 
pend on parameter while the parameters ^2 and $3 
are specific for D\ and D 2 , respectively. Assuming the 
error terms for both experiments to be independent, the 
joint likelihood function then is the product 

P(Di,I>a|0i,02,i>3,/) 
= P(Di|0 1 ,0 2 ,0 a ,J) x P(D 2 |0i,02,03,/), 
= P(D 1 \# 1 ,# 2 ,I) xP(D 2 \ti u $ 3 ,I). (22) 

In order to simplify things, in the following we will 
introduce the assumption that the conditional prior 
P($2|$i, $3, 1) is independent of $3, i.e., 



P(02|01, 03,-0 = PWl,/) 



(23) 



(since $2 and $3 were the parameters which did not 
jointly affect both experiments, this may be easily satis- 
fied, for example if P(tf x , tf a , 3 |I) = P(i?i|J) x P(i? 2 |I) X 
P(# 3 |I)). When considering additional data D 2 , the 
change in the (marginal) posterior distribution of the two 
parameters #1 and ~d 2 then is given by 



P(0i,0 2 |I>i,£2,J)=P(0i,0a|-Di,i) x 



P(fll|£>2,J) 

P(^IJ) 



so that in order to "update" the posterior distribution 
of $1 and $2 using the data D 2 that depends on the 
additional parameter i9 3 , we only need to consider the 
marginal prior and posterior distributions of the com- 
mon parameter di, P(t?i |J) and P($i\D 2 , 1). We can 
see that when updating the posterior by another poste- 
rior (24), the (marginal) prior needs to be cancelled out, 



otherwise it would enter twice into the resulting poste- 
rior. Since by combining the posteriors we will only learn 
about the common parameter $1 here, it will be easier 
to also integrate out "& 2 and only consider the (marginal) 
distributions involving $1, which then leads to 

P(# 1 \D 1 ,D 2 , 1) = P^ipi,/) X P p/]y • (25) 

The higher-dimensional case works completely analo- 
gously, just by considering the parameters $1, $ 2 , $3 to 
be sub-vectors. 
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3. The Gaussian approximation 

As we will see below, the derived posterior distribu- 
tions often turn out to be well approximated by a mul- 
tivariate Gaussian distribution with mean p and covari- 
ance matrix E: 

p(x.\D) « p(x;/z,E) 

= (2 ^ I7 ,ex P {-i(x-,rS-(x-,)}.(26) 

If the posterior distributions are expressed as Gaussians, 
it is particularly easy to analytically propagate prior and 
posterior information as described in the previous subsec- 
tion; in the following we will therefore apply these results 
to the Gaussian case. As a further simplification, we will 
also assume all prior distributions to be uniform. 

a. Identical parameter sets In order to combine the 
results coming from two experiments D\ and D2, we will 
need to combine their two posterior distributions as in 



covariances) and use those to combine the marginal pos- 
teriors as in Equation ( 25 ) and in the previous section. 



Equation (21 ). The results from experiments D\ and D2 



will be summarized by parameters' posterior means and 
covariances {fJ-i, Si} and {p2, E2}, respectively. Assum- 
ing uniform priors, we can now combine both as 

p(x\D 1 ,D 2 )=p(x\D 1 ) xp(x|£> 2 ) 

= p(x;/ii,£i) xp(x;/i 2 ,S 2 ) 

= p(x; Mc ,E c ), (27) 

i.e., the product of posterior densities again is Gaussian 
with mean fi c and covariance E c . The parameters of 
the combined posterior may then be derived using the 
following relationship 

(x - u) T U _1 (x - u) + (x - v) T V _1 (x - v), 
= (x- w) T W- 1 (x- w), (28) 

where 

w = W" 1 [Uu + Vv], W = U + V, (29) 
so that the new mean and covariance turn out as 

S- 1 = Si 1 + S 2 1 (30) 
Mc = E c [Hi V + 5£ X A*a] (31) 

|15j . The same argument is easily extended to an arbi- 
trary number N of experiments as 



N 



N 



= e n ]Te-V 



(32) 



(33) 



b. Differing parameter sets Now suppose we have 
results of two experiments in which the parameter 
sets were not quite identical, as in the previous Sec- 
tion |III C 2 b] One may now either directly derive esti- 
mates of the marginal distribution (i.e., their means and 



Otherwise, if given only the joint distributions (means 
and covariances) of the differing (but intersecting) pa- 
rameter sets, these may also be marginalized analytically. 
For a Gaussian distribution the marginal distribution of 
a subset of the variables is simply given by the corre- 
sponding subset of mean and covariance parameters, i.e., 
by dropping the rows and columns corresponding to the 
variables that are integrated out. 



D. Implementation 



Our implementation follows a four-step procedure to 
analyze each experiment, all of them implemented as 
LTPDA methods. The first step is to Fourier transform 
the data. The noise's power spectral density is estimated 
using the Welch method [T5] and applying a Blackman- 
Harris window. We can then compute the log-likelihood 
( |13| and therefore find the maximum of the posterior 
density function using a (Nelder-Mead) simplex search 
algorithm |20j . Since with our strong signal injections 
the likelihood surface apparently does not tend to exhibit 
many secondary maxima, this step is usually sufficient to 
determine the parameters to good accuracy and it is also 
more efficient than waiting for the Metropolis sampler to 
converge. However, if the likelihood surface shows sec- 
ondary maxima, this method may lead to an erroneous 
result. Next, the posterior covariance among parame- 
ters according to input signals, noise and the relevant 
transfer functions, is estimated by numerically evaluat- 
ing the Fisher information matrix at the maximum deter- 
mined in the previous optimization step. And finally, we 
can integrate the posterior using a Markov Chain Monte 
Carlo (MCMC) approach. We use a Metropolis algo- 
rithm |15U21) that will generate random samples from the 
parameters' (5-dimensional) posterior distribution. Gen- 
eration of these samples is relatively easy based only on 
the expression of the (unnormalized) posterior density 
function ((pj or Q). 



In order to enhance convergence of the MCMC sam- 
pler, we apply tempering to the posterior density func- 
tion, which is supposed to make it more tractable and 
keep the algorithm from getting stuck in local optima. 
In the MCMC context, tempering is commonly imple- 
mented by applying an exponent to the probability den- 
sity to be sampled from, i.e., instead of using the pos- 
terior p(9\D,I), the tempered posterior p(0\D, I)t is 
considered, where T > 1 is the "temperature" [2"Tl 122). 
The y exponent smoothens the targeted density func- 
tion, which generally allows the sampler to move more 
quickly and widely through parameter space and to tra- 
verse between local modes more easily. The following 
expression describes the temperature profile used in our 
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TABLE IV. Estimated parameters for independent experiments. 



Par am. 


Value 6 


Estimated 9 ± a 


\9 - 9\/a 


f/^CBB 






Experiment 1 






Gdf 


0.8 


0.800 02 ±0.000 02 


1.0 


1.0 


Gsus 


1.15 


1.150 0001 ±0.000 000 3 


0.4 


0.9 


o 

Wl 


-1.1 x 10~ 6 


(-1.0991 ±0.000 5) x 10~ 6 


1.7 


1.0 


2 
UJ' 2 


-2.2 x 1(T 6 


(-2.2001 ±0.000 3) x 10~ 6 


0.3 


1.0 


S21 


1.35 x 1CT 4 


(1.350 2 ±0.000 6) x 10" 4 


0.3 


1.0 


A 9 


-1.1 x 1(T 6 


(-1.1010 ±0.000 5) x 10~ 6 


2.1 


1.0 






Experiment 2 






Gdf 


0.8 


0.800 11 ±0.000 05 


2.2 


1.0 


Gsus 


1.15 


1.147 ± 0.004 


0.8 


1.0 





-2.4 x 10~ 6 


(-5 ±3) x 10~ 6 


0.8 


1.0 




-2.4 x 10~ 6 


(-5 ±3) x 10~ 6 


0.8 


1.0 


^21 


1.35 x 10~ 4 


(1.349 7 ±0.000 3) x 10" 4 


1.0 


0.9 


A 9 





(-3 ±6) x 10~ 10 


0.5 


1.1 






Experiment 3 








0.8 


799 8 ± 000 2 


1.2 


1.1 


Gsus 


1.15 


1.150 3 ±0.000 3 


0.8 


1.0 




-1.1 x 10~ 6 


(-1.25 ±0.09) x 10" 6 


1.7 


1.0 


2 


-2.2 x 10~ 6 


(-2.35 ±0.09) x 10" 6 


1.7 


1.0 


^21 


1.35 x 10~ 4 


(1.350 ±0.001) x 10~ 4 


0.3 


1.0 


Aw 2 


-1.1 x 10 -6 


(-1.0999 ±0.0003) x 10" 6 


0.2 


1.0 



implementation 



T 



with estimation uncertainties roughly following the cor- 



10^V~^J l<i<T h 
lO'K 1 -*) T h < i < T c 
1 i > TV, 



(34) 



with i indexing the samples of the Metropolis chain. We 
initially applied a constant temperature (with £ = 3) for 
the first 1000 iterations (Th = 1000), which was then 
exponentially annealed down in the following 1000 iter- 
ations (T c = 2000), after which the algorithm was prop- 
erly generating samples from the actual posterior distri- 
bution. To reduce the time required during the search 
phase we occasionally rescale the covariance matrix of 
the proposal distribution to explore a wider region of the 
parameter space. Also, as proposed in |151 123] , we cor- 
rect the standard deviation of the proposal distribution 
with a factor of d -1 / 2 , where d is the parameter space 
dimension. 



E. Results and discussion 

Figure [4] illustrates the marginal posterior probabil- 
ity density functions of the individual parameters based 
on the different experiments. Parameter estimates are 
shown in Table |IV[ together with a comparison of the es- 
timated error and the Cramer-Rao bounds, as derived in 



Section III B The parameters are recovered successfully 



responding CRB, as shown in the last column of table IV 
The worse estimate appears to be a ~ 2a deviation on 
the Gdf parameter in experiment 2. This result is still 
consistent with the true value used to generate the data. 
However, to further investigate this feature we generated 
a new set of data using the same tools and parameters. 
The analysis of the new data did not reproduce an offset 
estimate, whence we discarded a systematic bias on Gdf 
parameter in experiment 2. 

As expected, the best estimates come from the first 
experiment since the signal is richer in that case. The 
fact that a signal is injected on both channels makes this 
experiment the most sensitive in terms of the determina- 
tion of the stiffness difference between both test masses, 
reaching indeed the CRB, and obviously translating into 
a better estimate for the remaining parameters. 

Only the second experiment allows a better estimation 
of one of the parameters, 821, since in this case we are 
canceling the second cross-coupling term, uj\ — w 2 , by 
forcing stiffnesses from both test masses to have the same 
value. As expected, the absolute value of the stiffness 
can not be determined accurately in such a case. The 
reason being that the matched stiffness configuration is 
precisely designed to make the experiment insensitive to 
stiffness differences, which naturally turns into a poor 
estimation of the parameter. It is however remarkable 
that, thanks to the cross- variance terms, we can have 
a good determination of the difference between the two 
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stiffnesses, which should be identically zero in this case. 
That's indeed the value retrieved by our analysis with an 
uncertainty of 7 x 10~ 10 s~ 2 . 

It is worth comparing here the results obtained with 
the analysis to measured quantities. Although the nu- 
merical values may differ, it may be relevant to com- 
pare the uncertainties of the values in order to check 
that our model is in quantitative agreement with exper- 
iments being performed. To do so we take the stiffness 
as our figure of merit since it has been extensively char- 
acterized in the torsion pendulum facility |24j . Recent 
experiments in this facility report a remnant stiffness 
coupling the test masses to the surrounding GRS pro- 
totype of (2.5 ± 0.1) x 10 9 N/m [22]. When scaled by 
the mass of the LTP test masses (1.96 kg) so to be ex- 
pressed in terms of force per unit mass, these figure be- 
comes (1.28 ± 0.05) x 10 9 s~ 2 , which could be compared 
to the uncertainty in the estimation of the stiffness in 
our model, which reaches 3 x 10 -10 s -2 for the second 
test mass stiffness in experiment 1. The simplified noise 
model that we used for the analysis therefore seems to be 
consistent with the numbers coming from experiments. 
Both numbers are, however, orders of magnitude below 
the required remnant stiffness on board the satellite of 
14 x 10" 7 s- 2 [26]. 

The data analysis during LISA Pathfinder operations 
will be strongly conditioned by the operations schedule. 



In Section III C we describe how to exploit the posterior 
distribution in order to combine results from different 
experiments. We applied that scheme to our results in 
order to produce a unique set of parameters for both 
cases previously described: all parameters being identi- 
cal (experiment 1 & 3) and experiments with different 
numerical values of the parameters (combining all exper- 
iments) . Given the approximately normal distribution of 
the parameters that we get from the Monte Carlo inte- 
gration in Figure [4j it is justified to apply the Gaussian 
formalism that we introduced in Sectio n |III C| In partic- 
ular, we just need to apply Equation (25) to our set of 
experiments. Results in Table [V] show an improvement 
in the uncertainty of the estimate. According to (21 1, 



the same scheme could be obtained by considering the 
posterior distribution of one experiment as a prior for 
the following one. This would also improve the conver- 
gence time of the search, which could be an important 
consideration during operations. 



IV. SUMMARY AND FUTURE WORK 

We have shown how a Markov chain Monte Carlo 
method can be used for parameter estimation in the LISA 
Pathfinder mission. In order to demonstrate so, we gener- 
ated data from a simplified model of the main experiment 
on board the mission, the LTP. This data set contains 
runs where we injected signals to test the instrument, 
which must allow the recovery of the parameters, and 
also some runs without any injection, used to evaluate 



TABLE V. Combination of results for different experiments. 
Two values are reported when combining all experiments for 
parameters uf, wf and Aco 2 . The top one is the result ob- 
tained by combining the values for experiment 1 & 3, the 
bottom one corresponds to the matched stiffness experiment. 



Parameter 



Estimated 



Experiment 1 & 3 


All experiments 


Gdf 


0.800 02 ± 0.000 02 


0.800 03 ± 0.000 01 


G sus 


1.150 0001 ± 0.000 000 3 


1.150 000 9 ±0.000 000 3 


(x 1CT 6 ) 


(-1.1000L 0.0004) 


(-1.100 ±0.000 4) 

(-5 ±3) 


2 

(x icr 6 ) 


(-2.2001 ± 0.000 3) 


(-2.200 ±0.000 3) 

(-5 ±3) 


Sl2 

(x 1CT 4 ) 


(1.349 8 ± 0.000 5) 


(1.349 67 ±0.000 02) 


(x icr 6 ) 


(-1.100 2 ± 0.000 2) 


(-1.1002 ±0.0002) 
(-0.0003 ± 0.0006) 



the noise performance of the instrument. We think that 
the model used in our analysis serves as a complementary 
approach to the already existing LISA simulators, since 
it includes some more detail in the test mass dynamics 
and its coupling to the test mass motion, precisely one of 
the key points that LISA Pathfinder aims to investigate. 

The analysis presented here includes an estimate of the 
optimal error achievable (for an unbiased estimate) for a 
given injected signal and a configuration of the experi- 
ment. These results are of relevance for the mission since 
they show that it is as important to develop data anal- 
ysis tools as to to carefully design the experiment to be 
performed in flight. With our model, a different injec- 
tion signal showed to improve two orders of magnitude 
the estimation of the test mass stiffnesses — results for 
experiment 1 and 3 in Table [TTJ Although the expected 
parameter uncertainties in the real mission will be larger 
than the ones reported here, the dependencies on the pa- 
rameters are representative. Thus, the decrease on the 
optimal error could be applicable to the real mission as 
well. We will need however to confirm this result with 
more realistic models. 

The method developed here to analyse the data reaches 
roughly the optimal attainable error for each single ex- 
periment. The combination of the results for different 
experiments obviously reduces the uncertainty on the pa- 
rameters, reaching lower errors than the ones originally 
derived from the Cramer-Rao bound for each indepen- 
dent experiment. When combining different experiments, 
our analysis took advantage of the gaussian posterior ob- 
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tained during the sampling of the likelihood surface, so 
that a simple algebraic operation between gaussian dis- 
tribution was enough to derive a combined estimate of all 
experiments. However, the framework is general enough 
to include non-gaussian profiles, given that the full pro- 
file of the posterior is obtained during the sampling of 
the likelihood surface. 

The combination of estimates was performed here as 
an off-line operation, i.e. after all experiments were anal- 
ysed. A natural extension to this work would be to use 
the posterior distribution for a given experiment as prior 
for the next one, as motivated in Equation ( pTj ). This 
concept of a chain of experiments is particularly suitable 
for LISA Pathfinder since, during flight operations, we 
will naturally need to include results of previous exper- 
iments in the next foreseen ones. In other words, if the 
test mass stiffnesses are clearly determined in an experi- 
ment we may want to use that information for forthcom- 
ing experiments in order to effectively reduce the dimen- 
sion of our problem. The method described here pro- 
vides a way to include this information in the analysis in 
a clear way. Moreover, the capability to use this infor- 
mation could be a powerful advantage during operations 



due to the reduction of convergence time that it implies. 

An increase in the uncertainty on the estimates is to 
be expected when dealing with a more realistic model 
due to the increase in dimensions of the parameter 
space. This is precisely the step that we will face in the 
forthcoming activities in preparation for the LTP data 
analysis. Our aim is to study in detail the experiments 
defined to be implemented in flight, now that the basic 
functionality of the parameter estimation tool is already 
demonstrated. In that sense, next steps will include 
a three dimensions model and more complex injected 
signals, that will make use of the full capabilities of the 
spacecraft. This work is ongoing and will be presented 
in due time. 
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FIG. 4. Histograms of the MCMC samples illustrating the individual parameters' marginal posterior probability distributions 
as computed with the last 3500 samples of the chain. All histograms are plot with the same y axes range, up to 250 counts. 
Black vertical lines illustrate the true parameter values. Parameters Gdf , G sus and 821 are dimensionless; dimensions for stiffness 
parameters are [wjl = [wf] = s -2 . 



